\[\huge\textbf{About the LabR Group}\]


  • The LabR Group is composed of clinical laboratory professionals who specialize in quality management and in the statistical tools that support it. It is a voluntary, non-profit group established in 2021 with the aim of applying digital transformation to quality tools through the R programming language, which is free and open source.

Contents

A. Detailed

1. Information about the study


  • R packages:
##      MethComp       qqplotr      imputeTS      multiway     prettydoc 
##          TRUE          TRUE          TRUE          TRUE          TRUE 
##       moments        mclust          ggQC         stats           ffp 
##          TRUE          TRUE          TRUE          TRUE          TRUE 
## RVAideMemoire      forecast           AID       ggplot2       nortest 
##          TRUE          TRUE          TRUE          TRUE          TRUE 
##     rmarkdown            gt          MASS        plotly        ggpubr 
##          TRUE          TRUE          TRUE          TRUE          TRUE 
##     calibrate      mixtools         knitr    kableExtra       modeest 
##          TRUE          TRUE          TRUE          TRUE          TRUE 
##         dplyr       cluster    factoextra    FactoMineR            DT 
##          TRUE          TRUE          TRUE          TRUE          TRUE 
##   cartography    data.table      devtools      univOutl     multimode 
##          TRUE          TRUE          TRUE          TRUE          TRUE 
##          mixR       refineR    KernSmooth       janitor          utf8 
##          TRUE          TRUE          TRUE          TRUE          TRUE 
##         readr        readxl      openxlsx 
##          TRUE          TRUE          TRUE


  • Person responsible: Alan C Dias;

  • Measurement procedure: Information about the measurement procedure;

  • Name of the measurand: testcase1;

  • Unit of measurement: n.a.;

  • Type of blood specimen: Inform the type of blood specimen;

  • Exclusion criteria: Inform the exclusion criteria;

  • Data source: Inform about data source;

  • Age range: Inform the age range;

  • Sex: Inform the sex;

  • Settings:

    • Number of decimal places: 2;

    • Setting the limits of the Reference Interval (‘Double-sided’ or ‘One-sided’): Double-sided. There is a Lower Limit and an Upper Limit of reference;

    • Manual selection of the number of clusters: the number of clusters was automatically chosen by the algorithm.

2. Data preprocessing

Table 2.2. Measures of Position - Part 1.

| Statistical parameters | Results |
|---|---|
| Sample size | 10000 |
| Minimum | 0.77 |
| Mode | 20.03 |
| Mean | 19.99 |
| Median | 19.97 |
| Maximum | 38.60 |

Table 2.3. Measures of Position - Part 2.

| Statistical parameters | Results |
|---|---|
| Sample size | 10000 |
| 1st percentile | 4.04 |
| 2.5th percentile | 4.89 |
| 5th percentile | 5.94 |
| 10th percentile | 8.33 |
| 25th percentile | 15.57 |
| 50th percentile (median) | 19.97 |
| 75th percentile | 24.45 |
| 90th percentile | 31.82 |
| 95th percentile | 33.96 |
| 97.5th percentile | 34.97 |
| 99th percentile | 35.92 |

Table 2.4. Measures of Dispersion.

| Statistical parameters | Results |
|---|---|
| Sample size | 10000 |
| Standard Deviation (SD) | 7.69 |
| Variance | 59.17 |
| Interquartile Range (IQR) | 8.88 |
| Range | 37.83 |

Table 2.5. Outliers removed after 5 cycles of Tukey’s test using the ‘resistant’ method from the ‘univOutl’ package.

| Statistical parameters | Results |
|---|---|
| Sample size | 10000 |
| Outliers detected based on the box plot | 0 |
| Percentage of outliers detected and removed | 0 |
| Remaining sample size after removal of outliers | 10000 |


\[\Large\textbf{A) Data cleaning cycles:}\]

  • The algorithm consists of 5 cycles of the Tukey test (k = 2, “resistant” method, package univOutl) to identify and remove outliers;

  • When method = ‘resistant’ from the univOutl package, the outliers are those outside the range specified by eq.(1):

\[\large{[Q_{1} - k \times IQR;Q_{3} + k \times IQR ]}\ \ \ \ \ \ \ \ \ \ {(1)}\]

  • where Q1 is the 1st quartile, Q3 is the 3rd quartile, IQR is the interquartile range (Q3 - Q1), and k is a non-negative constant that sets the width of the fences (the length of the whiskers in a boxplot). The value k = 1.5, called Tukey’s “inner fence”, is the one commonly used when drawing a boxplot; k = 2 and k = 3 are referred to as the “intermediate fence” and the “outer fence”, respectively.
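As an illustration of eq. (1), the fences and the repeated cleaning cycles can be sketched as follows. This is a minimal Python sketch, not the univOutl code used by the report: the function names `tukey_fences` and `remove_outliers` are hypothetical, the quartiles use linear interpolation (an assumption about the quantile definition), and stopping early once a cycle removes nothing is also an assumption.

```python
from statistics import quantiles


def tukey_fences(values, k=2.0):
    """Return the interval [Q1 - k*IQR, Q3 + k*IQR] of eq. (1)."""
    # method="inclusive" interpolates linearly between order statistics
    # (assumed quantile definition, not confirmed by the report).
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=2.0, cycles=5):
    """Apply the fences repeatedly, dropping values outside them."""
    kept = list(values)
    for _ in range(cycles):
        low, high = tukey_fences(kept, k)
        filtered = [v for v in kept if low <= v <= high]
        if len(filtered) == len(kept):
            break  # nothing removed, later cycles would change nothing
        kept = filtered
    return kept


sample = [10, 11, 12, 13, 14, 15, 16, 17, 18, 100]
print(tukey_fences(sample))      # fences of the first cycle
print(remove_outliers(sample))   # values kept after the cycles
```

Each cycle recomputes the quartiles on the data that survived the previous cycle, which is why several cycles can remove values that a single pass would keep.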

\[\Large\textbf{B) Algorithm for data transformation:}\]

  • After the removal of outliers through 5 consecutive cycles of Tukey’s test (k = 2 and method = resistant), the ‘Transformation Algorithm’ chooses between two Box-Cox estimation methods (loglik and Guerrero), selecting the method and the lambda parameter that give the best kurtosis and skewness coefficients. The selected method is used to normalize the remaining data only if the prerequisites listed below are met. The ideal value of lambda is the one that brings the data closest to a normal distribution. Lambda is restricted to the range -2 to 2 to reduce the risk of over-transforming the data. The transformation of y is performed using eq. (2):


\[\Large{y_i^{(\lambda)} = \left \{\begin{matrix} {y_i^\lambda - 1\over \lambda}, & \mbox{if }{ \;\;\lambda\neq0} \\ \ln(y_i), & \mbox{if }{ \;\;\lambda=0} \end{matrix} \right.}\ \ \ \ \ \ \ \ \ \ {(2)}\]


  • Prerequisites for using the remaining data transformed by the selected method (loglik or Guerrero) instead of the original remaining non-transformed data:

    • The kurtosis coefficient of the transformed remaining data: >= 2;

    • The skewness coefficient of the original remaining data: <= -0.20 or >= 0.20;

    • |Skewness coefficient of the transformed remaining data| < |skewness coefficient of the original remaining data|.
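The transformation in eq. (2) and the three prerequisites above can be sketched as follows. This is a hedged Python illustration, not the report's R implementation: the helper names are hypothetical, lambda is passed in rather than estimated by loglik or Guerrero, and the moment-based skewness and non-excess kurtosis (normal distribution = 3) are assumed to match the coefficients the report uses.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of eq. (2); values must be strictly positive."""
    if lam == 0:
        return [math.log(y) for y in values]
    return [(y ** lam - 1) / lam for y in values]


def central_moment(values, order):
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    # Moment-based skewness (assumed definition).
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    # Non-excess kurtosis: a normal distribution gives 3 (assumed definition).
    return central_moment(values, 4) / central_moment(values, 2) ** 2


def use_transformed(original, transformed):
    """Check the three prerequisites for preferring the transformed data."""
    skew_original = skewness(original)
    return (
        kurtosis(transformed) >= 2
        and abs(skew_original) >= 0.20
        and abs(skewness(transformed)) < abs(skew_original)
    )


right_skewed = [math.exp(x) for x in [-2, -1, -1, 0, 0, 0, 1, 1, 2]]
print(use_transformed(right_skewed, box_cox(right_skewed, 0.0)))
```

When the original data are already close to symmetric, the second prerequisite fails and the untransformed data are kept, which is consistent with the identical shape statistics reported before and after the transformation step in this report.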

\[\Large\textbf{C) Interpreting the Shape of Data Distributions:}\]

  • Skewness refers to a distortion or asymmetry that deviates from the symmetrical bell curve, or normal distribution, in a set of data. If the curve is shifted to the left or right, it is said to be skewed.


    • Positive skewness (right-skewed distribution): most values are clustered toward the left side of the distribution, while the right tail is longer.


    • Negative skewness (left-skewed distribution): most values are clustered toward the right side of the distribution, while the left tail is longer.


  • Kurtosis: similar to “skewness”, kurtosis is a statistical measure used to describe a distribution. While skewness differentiates extreme values in one tail versus the other, kurtosis measures extreme values in both tails. Kurtosis is a measure of the combined weight of the tails of a distribution relative to the center of the distribution.


    • Leptokurtic: A leptokurtic distribution has heavy tails on both sides, indicating the presence of extreme values. Theoretically, a leptokurtic distribution is one with kurtosis above 3. In this report, a distribution with kurtosis above 3.3 is considered leptokurtic.


    • Mesokurtic: Data that follows a mesokurtic distribution shows an excess kurtosis of zero or close to zero. This means that if the data follows a normal distribution, it follows a mesokurtic distribution. Theoretically, the kurtosis of a normal distribution is equal to 3. A distribution of results between 2.7 and 3.3 will be considered mesokurtic.


    • Platykurtic: A platykurtic distribution has light tails, indicating few extreme values. Theoretically, a platykurtic distribution is one with kurtosis below 3. In this report, a distribution with kurtosis below 2.7 is considered platykurtic.

2.1. Data shape before preprocessing

  • Distribution profile of data before the Outlier Removal Algorithm.

Table 2.6. Data distribution before outlier detection and exclusion (n = 10000).

| Statistical parameters | Results |
|---|---|
| Kurtosis coefficient | 2.74 |
| Interpretation of the kurtosis coefficient | The distribution can be considered mesokurtic. |
| Skewness coefficient | -0.01 |
| Interpretation of the skewness coefficient | The distribution is approximately symmetrical. |

Footnote:
If the kurtosis is between 2.7 and 3.3, the distribution can be considered mesokurtic.
If the kurtosis is less than 2.7, the distribution is platykurtic.
If the kurtosis is greater than 3.3, the distribution is leptokurtic.
If the skewness is less than -1 or greater than 1, the distribution is highly skewed.
If the skewness is between -1 and -0.5 or between 0.5 and 1, the distribution is moderately skewed.
If the skewness is between -0.5 and -0.15 or between 0.15 and 0.5, the distribution is slightly skewed.
If the skewness is between -0.15 and 0.15, the distribution is approximately symmetrical.
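The thresholds in the footnote to Table 2.6 amount to a simple lookup, sketched below in Python. The function names are hypothetical, the handling of values that fall exactly on a threshold is an assumption (the footnote does not say which side they belong to), and the label "slightly skewed" for the 0.15 to 0.5 band is also an assumption.

```python
def classify_kurtosis(kurtosis):
    """Map a non-excess kurtosis coefficient to the report's categories."""
    if kurtosis < 2.7:
        return "platykurtic"
    if kurtosis > 3.3:
        return "leptokurtic"
    return "mesokurtic"


def classify_skewness(skewness):
    """Map a skewness coefficient to the report's categories."""
    magnitude = abs(skewness)
    if magnitude > 1:
        return "highly skewed"
    if magnitude >= 0.5:
        return "moderately skewed"
    if magnitude > 0.15:
        return "slightly skewed"  # assumed label for this band
    return "approximately symmetrical"


# Coefficients reported in Table 2.6.
print(classify_kurtosis(2.74), "/", classify_skewness(-0.01))
```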

2.2. Data shape after preprocessing

  • Distribution profile of the data after the outlier removal algorithm.

Table 2.7. Data shape after outlier detection (n = 10000).

| Statistical parameters | Results |
|---|---|
| Kurtosis coefficient | 2.74 |
| Interpretation of the kurtosis coefficient | The distribution can be considered mesokurtic. |
| Skewness coefficient | -0.01 |
| Interpretation of the skewness coefficient | The distribution is approximately symmetrical. |

Footnote: the kurtosis and skewness thresholds are the same as those listed in the footnote to Table 2.6.

2.3. Box-Cox transformation

  • Distribution profile of the data after ‘Box-Cox Transformation Algorithm’, if applicable.

Table 2.8. Data shape after the Box-Cox transformation, if applicable (n = 10000).

| Statistical parameters | Results |
|---|---|
| Kurtosis coefficient | 2.74 |
| Interpretation of the kurtosis coefficient | The distribution can be considered mesokurtic. |
| Skewness coefficient | -0.01 |
| Interpretation of the skewness coefficient | The distribution is approximately symmetrical. |

Footnote: the kurtosis and skewness thresholds are the same as those listed in the footnote to Table 2.6.

3. Clustering and truncation


  • To deconvolve a mixed population into its components, the Expectation-Maximization (EM) algorithm, an unsupervised clustering method, is used to estimate the statistical parameters (mean, standard deviation, and proportion or weight) of every component of the mixed data. These parameters are estimated by maximum likelihood; in other words, the EM algorithm is a way of applying maximum likelihood estimation to each component of the mixture. The EM algorithm is combined with other algorithms that estimate the global mean, global median, modes, antimodes, kurtosis, and skewness. Together they identify the cluster, or combination of overlapping clusters, that best represents the distribution of the reference population, truncate this distribution, and estimate the Lower Limit (LL) and Upper Limit (UL) of the Reference Interval (RI). The following R packages are used: ggQC, modeest, multimode, moments, univOutl, mclust, mixR.
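To make the EM step concrete, the sketch below fits a one-dimensional Gaussian mixture by alternating an expectation step (responsibilities) and a maximization step (weighted means, standard deviations and weights). It is a bare-bones Python illustration under stated assumptions, not the mclust or mixR implementation used by the report: the function name, the initialization from equal-sized chunks of the sorted data, and the fixed number of iterations are all hypothetical choices, and the number of components `k` is supplied by the caller rather than selected automatically.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def em_gaussian_mixture(values, k, iterations=100):
    """Return a list of (mean, sd, weight) tuples, one per component."""
    data = sorted(values)
    n = len(data)
    # Initialization (assumption): split the sorted data into k equal chunks.
    chunks = [data[i * n // k:(i + 1) * n // k] for i in range(k)]
    means = [sum(c) / len(c) for c in chunks]
    overall_mean = sum(data) / n
    overall_sd = math.sqrt(sum((v - overall_mean) ** 2 for v in data) / n)
    sds = [overall_sd / k] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: probability that each value belongs to each component.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s)
                    for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: weighted maximum likelihood estimates per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0:
                continue  # empty component, keep its previous parameters
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            sds[j] = max(math.sqrt(var), 1e-6)
    return list(zip(means, sds, weights))


# Two well-separated synthetic groups centred on 0 and 10.
group_a = [-1 + 0.02 * i for i in range(101)]
group_b = [9 + 0.02 * i for i in range(101)]
for mean, sd, weight in em_gaussian_mixture(group_a + group_b, k=2):
    print(round(mean, 2), round(sd, 2), round(weight, 2))
```

The three values returned per component correspond to the cluster mean, cluster standard deviation and mixture proportion that the report tabulates for each cluster.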

Table 3.1. Modes and Anti-modes (n = 10000).

| Quantity (n) | Modes | Anti-modes |
|---|---|---|
| 1 | 6.09 | 9.4 |
| 2 | 19.35 | 30.57 |
| 3 | 33.71 | |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |

Table 3.2. Statistical parameters estimated by Maximum Likelihood after Box-Cox transformation (n = 10000).

| Statistical parameters | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 |
|---|---|---|---|---|---|---|---|
| Global Midpoint (Mode or Median) | 20.04 | 20.04 | 20.04 | N.A. | N.A. | N.A. | N.A. |
| Cluster Mean | 6.04 | 20 | 33.93 | N.A. | N.A. | N.A. | N.A. |
| Cluster Standard Deviation | 1.57 | 4.95 | 1.49 | N.A. | N.A. | N.A. | N.A. |
| Mixture Proportion (Weight) | 0.1 | 0.8 | 0.1 | N.A. | N.A. | N.A. | N.A. |
| Cluster Truncated Distribution (+/- 3SD) | 1.33 to 10.75 | 5.15 to 34.85 | 29.45 to 38.41 | N.A. | N.A. | N.A. | N.A. |

Truncated Global Distribution (all selected clusters): 5.15 to 34.85.
N.A.: Not applicable.
The columns with green background indicate the best clusters used to estimate the ‘Truncated Global Distribution’.
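The ‘Cluster Truncated Distribution (+/- 3SD)’ row of Table 3.2 can be reproduced from the cluster mean and standard deviation, as in the Python sketch below. The function names are hypothetical, and the inputs are the rounded values printed in the table, so the last digit can differ from limits computed with unrounded parameters.

```python
def truncation_limits(mean, sd, width=3.0):
    """Return (mean - width*sd, mean + width*sd)."""
    return mean - width * sd, mean + width * sd


def truncate(values, lower, upper):
    """Keep only the values inside the truncation limits."""
    return [v for v in values if lower <= v <= upper]


# Cluster 2 of Table 3.2: mean 20, SD 4.95.
lower, upper = truncation_limits(20, 4.95)
print(round(lower, 2), "to", round(upper, 2))
print(truncate([1, 6, 20, 34, 40], lower, upper))
```

For cluster 2 this gives 5.15 to 34.85, the same range reported as the Truncated Global Distribution.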

Table 3.3. Shape of the ‘Truncated Global Distribution’ data (n = 9419).

| Statistical parameters | Results |
|---|---|
| Kurtosis coefficient | 2.79 |
| Interpretation of the kurtosis coefficient | The distribution can be considered mesokurtic. |
| Skewness coefficient | 0 |
| Interpretation of the skewness coefficient | The distribution is approximately symmetrical. |

Footnote: the kurtosis and skewness thresholds are the same as those listed in the footnote to Table 2.6.

4. Estimated reference interval

Table 4.1. Reference Interval estimated by the LabRI Method: testcase1 (n.a.) - Age: Inform the age range - Sex: Inform the sex. n = 9419.

| LabRI Method | 95% Reference Interval - Bilateral | 90% Confidence Interval (CI) of the Lower Limit | 90% Confidence Interval (CI) of the Upper Limit |
|---|---|---|---|
| Results | 10.3 to 29.7 | 10.09 to 10.51 | 29.49 to 29.91 |

Footnote:
Comparative Reference Interval: 10.2 to 29.8
Source of the Comparative Reference Interval:
* dataset - Package ‘refineR’

5. Indirect verification of the reference interval


  • It is important for laboratories to verify their reference intervals (RIs) before applying them for routine clinical care. This requirement applies to reference intervals derived using the indirect approach.


  • Conventional approach:


    • This can be achieved through the conventional approach, in which the laboratory analyzes samples from 20 individuals from the reference population who do not have the predefined condition. The reference interval is considered verified if two or fewer of the 20 results fall outside the 95% RI.


  • Alternative approach:


    • Alternatively, laboratories may evaluate whether the provided reference interval is appropriate for their patient population and analytical method by monitoring the percentage of abnormal results.

    • There are additional approaches for verifying RIs beyond those recommended by CLSI EP28-A3c guidelines. Indirect data mining methods are applied to existing laboratory data to verify the established Reference Interval.


  • Indirect techniques for reference interval verification:


    • Analytical bias;

    • Flagging rates;

    • Sample size verification.
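As a side note on the conventional 20-sample check described above, the short Python sketch below computes how often that check would pass if exactly 5% of reference values truly fall outside the interval. It assumes independent results and a binomial model; the function name is hypothetical and the calculation is an illustration, not part of the LabRI method.

```python
from math import comb


def prob_at_most(k, n, p):
    """Binomial probability of observing at most k events in n trials."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


# Chance that two or fewer of 20 results fall outside a true 95% interval.
p_pass = prob_at_most(2, 20, 0.05)
print(round(p_pass, 4))
```

Under these assumptions a correct interval passes the check in roughly 92% of verification runs, so an occasional failure does not by itself show that the interval is wrong.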


5.1. Analytical bias


A) Midpoint and empirical standard deviation, eq. (3) to eq. (5):


\[\large{Midpoint_{RI} = {UL_{RI} + LL_{RI} \over 2}}\ \ \ \ \ \ \ \ \ \ {(3)}\]


\[\large{SD_{e} = {UL_{RI} - LL_{RI} \over 3.92}}\ \ \ \ \ \ \ \ \ \ {(4)}\]


\[\large{SD_{e\ _{combined}} = \sqrt{(SD_{e\ _{LabRI}})^2 + (SD_{e\ _{CR}})^2\over2}}\ \ \ \ \ \ \ \ \ \ {(5)}\]


  • \(UL\ _{RI}\): Upper Limit of the Reference Interval.
  • \(LL\ _{RI}\): Lower Limit of the Reference Interval.
  • \(SD_{e}\): Standard Deviation that encompasses the Reference Interval. Also known as empirical standard deviation.
  • \(SD_{e\ _{LabRI}}\): Empirical Standard Deviation of the LabRI Method.
  • \(SD_{e\ _{CR}}\): Empirical Standard Deviation of the Comparative Reference.
  • \(Midpoint\ _{RI}\): Midpoint of the Reference Interval.
  • \(SD_{e\ _{combined}}\): Combined Standard Deviation of the LabRI Method and the Comparative Reference.
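Eq. (3) to eq. (5) can be checked numerically with the two reference intervals reported in section 4 (LabRI: 10.3 to 29.7; Comparative Reference: 10.2 to 29.8). The Python sketch below uses hypothetical function names; the divisor 3.92 equals 2 x 1.96, the width of a central 95% interval of a normal distribution expressed in standard deviations.

```python
import math


def midpoint(lower, upper):
    """Eq. (3): midpoint of the reference interval."""
    return (upper + lower) / 2


def empirical_sd(lower, upper):
    """Eq. (4): SD implied by a 95% interval (3.92 = 2 x 1.96)."""
    return (upper - lower) / 3.92


def combined_sd(sd_a, sd_b):
    """Eq. (5): root mean square of the two empirical SDs."""
    return math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)


labri = (10.3, 29.7)
comparative = (10.2, 29.8)
sd_labri = empirical_sd(*labri)
sd_cr = empirical_sd(*comparative)
print("midpoints:", round(midpoint(*labri), 3), round(midpoint(*comparative), 3))
print("empirical SDs:", round(sd_labri, 3), round(sd_cr, 3))
print("combined SD:", round(combined_sd(sd_labri, sd_cr), 3))
```

The results (midpoints of 20, empirical standard deviations of about 4.949 and 5, combined value of about 4.975) agree with the figures reported in Tables 5.1 and 5.2.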


B) Analytical bias and bias limit, eq. (7) to eq. (10):

\[\large{Bias\ limit = 0.375 \times SD_{e\ _{combined}}}\ \ \ \ \ \ \ \ \ \ {(7)}\]


\[\large{|Analytical\ Bias\ _{UL}| = |UL\ _{RI_{CR}} - UL\ _{RI_{LabRI}}|}\ \ \ \ \ \ \ \ \ \ {(8)}\]


\[\large{|Analytical\ Bias\ _{Midpoint}| = |Midpoint\ _{RI_{CR}} - Midpoint\ _{RI_{LabRI}}|}\ \ \ \ \ \ \ \ \ \ {(9)}\]


\[\large{|Analytical\ Bias\ _{LL}| = |LL\ _{RI_{CR}} - LL\ _{RI_{LabRI}}|}\ \ \ \ \ \ \ \ \ \ {(10)}\]


  • \(UL_{RI\ _{CR}}\): Upper Limit of the Comparative Reference’s Reference Interval.
  • \(UL_{RI\ _{LabRI}}\): Upper Limit of the LabRI Method’s Reference Interval.
  • \(Midpoint\ _{RI\ _{CR}}\): Midpoint of the Comparative Reference’s Reference Interval.
  • \(Midpoint\ _{RI\ _{LabRI}}\): Midpoint of the LabRI Method’s Reference Interval.
  • \(LL_{RI\ _{CR}}\): Lower Limit of the Comparative Reference’s Reference Interval.
  • \(LL_{RI\ _{LabRI}}\): Lower Limit of the LabRI Method’s Reference Interval.


C) Bias Ratio (verifying the practical relevance or clinical significance), eq. (11) to eq. (13):


  • To evaluate the practical significance of the differences between the Reference Intervals of the two methods, the Bias Ratio (BR) defined below is calculated at the Lower Limit, Midpoint, and Upper Limit of the estimated Reference Intervals.


\[\large{BR_{UL} = {|Analytical\ Bias\ _{UL}| \over SD_{e\ _{combined}}}}\ \ \ \ \ \ \ \ \ \ {(11)}\]


\[\large{BR_{Midpoint} = {|Analytical\ Bias\ _{Midpoint}| \over SD_{e\ _{combined}}}}\ \ \ \ \ \ \ \ \ \ {(12)}\]


\[\large{BR_{LL} = {|Analytical\ Bias\ _{LL}| \over SD_{e\ _{combined}}}}\ \ \ \ \ \ \ \ \ \ {(13)}\]


  • \(BR\ _{UL}\): Bias Ratio of the Upper Limits of the Reference Intervals.
  • \(BR\ _{Midpoint}\): Bias Ratio of the Midpoints of the Reference Intervals.
  • \(BR\ _{LL}\): Bias Ratio of the Lower Limits of the Reference Intervals.
  • \(Analytical\ Bias\ _{UL}\): Analytical Bias of the Upper Limits of the Reference Intervals.
  • \(Analytical\ Bias\ _{Midpoint}\): Analytical Bias of the Midpoints of the Reference Intervals.
  • \(Analytical\ Bias\ _{LL}\): Analytical Bias of the Lower Limits of the Reference Intervals.
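The bias limit, the absolute analytical bias and the Bias Ratio from eq. (7) to eq. (13) reduce to a few lines of arithmetic. The Python sketch below applies them to the limits reported in this document (Comparative Reference: 10.2 to 29.8, LabRI: 10.3 to 29.7, combined empirical SD of 4.975); the function name `bias_and_ratio` is hypothetical.

```python
def bias_and_ratio(limit_cr, limit_labri, sd_combined):
    """Return (|analytical bias|, bias ratio) for one pair of limits."""
    bias = abs(limit_cr - limit_labri)
    return bias, bias / sd_combined


sd_combined = 4.975
bias_limit = 0.375 * sd_combined  # eq. (7)

pairs = [("UL", 29.8, 29.7), ("Midpoint", 20.0, 20.0), ("LL", 10.2, 10.3)]
for name, cr, labri in pairs:
    bias, ratio = bias_and_ratio(cr, labri, sd_combined)
    acceptable = bias <= bias_limit
    print(name, round(bias, 2), round(ratio, 2), acceptable)
```

Because the Bias Ratio is the bias divided by the combined SD, comparing the bias with 0.375 x SD and comparing the ratio with 0.375 are the same test.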


Table 5.1. Statistical parameters of the compared reference intervals.

| Statistical parameters | RI of the Comparative Reference | RI of the LabRI Method |
|---|---|---|
| Midpoint of the Reference Interval | 20 | 20.000 |
| Empirical Standard Deviation (SD_e) | 5 | 4.949 |

Table 5.2. Assessing the analytical (or population) bias between the Upper Limits of the Reference Intervals of the Comparative Reference and the LabRI Method.

| Statistical parameters | Results of bias between Upper Limits |
|---|---|
| Combined SD_e | 4.975 |
| Analytical bias | 29.8 - 29.7 = 0.1 |
| Bias limit = 0.375 x ‘Combined SD_e’ | 0.375 x 4.975 = 1.866 |
| \|Bias Ratio (BR)\| | 0.02 |
| BR limit | 0.375 |
| Conclusion | The analytical difference between the Upper Limits of the Reference Intervals (Comparative and LabRI) is smaller than the minimum quality specification for bias based on biological variation components. |

Table 5.3. Assessing the analytical (or population) bias between the Midpoints of the Reference Intervals of the Comparative Reference and the LabRI Method.

| Statistical parameters | Results of bias between Midpoints |
|---|---|
| Combined SD_e | 4.975 |
| Analytical bias | 20 - 20 = 0 |
| Bias limit = 0.375 x ‘Combined SD_e’ | 0.375 x 4.975 = 1.866 |
| \|Bias Ratio (BR)\| | 0 |
| BR limit | 0.375 |
| Conclusion | The analytical difference between the Midpoints of the Reference Intervals (Comparative and LabRI) is smaller than the minimum quality specification for bias based on biological variation components. |

Table 5.4. Assessing the analytical (or population) bias between the Lower Limits of the Reference Intervals of the Comparative Reference and the LabRI Method.

| Statistical parameters | Results of bias between Lower Limits |
|---|---|
| Combined SD_e | 4.975 |
| Analytical bias | 10.2 - 10.3 = -0.1 |
| Bias limit = 0.375 x ‘Combined SD_e’ | 0.375 x 4.975 = 1.866 |
| \|Bias Ratio (BR)\| | 0.02 |
| BR limit | 0.375 |
| Conclusion | The analytical difference between the Lower Limits of the Reference Intervals (Comparative and LabRI) is smaller than the minimum quality specification for bias based on biological variation components. |

5.2. Flagging rates

Verification of flagging rates:

  • The expected flagging rates will be compared with their current rates derived from the calculations of the original indirect study. When the increase in flagging in any direction does not exceed the predefined quality targets, the RI under evaluation can be considered acceptable for use. The goal is to assess the flagging rates to determine whether a change in the historical Reference Interval will create higher flagging rates.

  • Based on the minimum-category principle used to define the permissible bias limits, the flagging rate can reach 5.7% or lower at one of the Reference Limits, while the rate at the other limit would be 1% or lower.

  • Considering this scenario, the flagging rates are assessed in the result set that includes the best cluster, because this set is assumed to consist mostly of the normal results used to estimate the Reference Interval.

Table 5.6. Evaluation of the ‘Flagging Rates’ of Data from the ‘New Truncated Global Distribution’ (n = 9419).

| Reference Interval (RI) | Sample size | %TORR | %RWRI | %BLL | %AUL | %D-BLL-AUL | %S-BLL-AUL |
|---|---|---|---|---|---|---|---|
| RI of Comparative Reference | 9419 | 5.81 | 80.77 | 9.63 | 9.60 | 0.03 | 19.23 |
| RI of LabRI method | 9419 | 5.81 | 80.50 | 9.77 | 9.74 | 0.03 | 19.51 |

When %TORR, %BLL, %AUL, %D-BLL-AUL, or %S-BLL-AUL exceeds 10%, 5.7%, 5.7%, 4.7%, or 6.7%, respectively, it is recommended to review the exclusion criteria applied to the original database or the partitioning by sex or age group. In these cases, the table cell has a red background.
Footnote:
%TORR: % Truncated Out of Range Results; %RWRI: % Results Within the Reference Interval; %BLL: % Below the Lower Limit; %AUL: % Above the Upper Limit; %D-BLL-AUL: Difference between the %BLL and the %AUL; %S-BLL-AUL: Sum of the %BLL and the %AUL.
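The quantities defined in the footnote to Table 5.6 can be computed as in the Python sketch below. The function names are hypothetical; taking the difference between %BLL and %AUL as an absolute value, and computing %TORR as the share of the original results removed by truncation, are assumptions that are consistent with the reported figures but are not spelled out in the report.

```python
def flagging_rates(values, lower, upper):
    """Percentages of results below, above and within a reference interval."""
    n = len(values)
    below = 100 * sum(v < lower for v in values) / n
    above = 100 * sum(v > upper for v in values) / n
    return {
        "%RWRI": 100 - below - above,
        "%BLL": below,
        "%AUL": above,
        "%D-BLL-AUL": abs(below - above),  # absolute difference (assumption)
        "%S-BLL-AUL": below + above,
    }


def truncated_out_of_range(n_total, n_truncated):
    """%TORR, assumed to be the share of results removed by truncation."""
    return 100 * (n_total - n_truncated) / n_total


# Toy data: the integers 1 to 100 with a reference interval of 11 to 95.
print(flagging_rates(list(range(1, 101)), lower=11, upper=95))
print(truncated_out_of_range(10000, 9419))
```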

5.3. Sample size verification

  • The number of values (results or individuals selected) directly affects the accuracy of the calculation of reference limits. The calculation of the Confidence Interval (CI) for each reference limit allows for validation of the number of selected individuals (sample size). It is generally accepted that the 90% CI for each reference limit should be < 0.2 times the width of the 95% Reference Interval (RI) in question.


Table 5.7. Validating the sample size.

| Statistical parameters | Lower Limit of RI | Upper Limit of RI |
|---|---|---|
| Width of the 90% Confidence Interval (CI) | 10.51 - 10.09 = 0.42 | 29.91 - 29.49 = 0.42 |
| Width of the 95% Reference Interval (RI) | 29.7 - 10.3 = 19.4 | 29.7 - 10.3 = 19.4 |
| ‘CI width’ / ‘RI width’ ratio | 0.42 / 19.4 = 0.022 | 0.42 / 19.4 = 0.022 |
| ‘CI width’ / ‘RI width’ ratio limit | 0.2 | 0.2 |

Footnote:
CI: Confidence Interval. RI: Reference Interval. Results with a ‘CI width’ / ‘RI width’ ratio above 0.2 have the column with a yellow background. In this scenario, the recommendation is to visually assess the distribution profile of the results and check if the estimated Reference Limit leads to a percentage of abnormal results significantly different from the expected one.
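The sample size check in Table 5.7 is a ratio of two widths, sketched below in Python with the limits reported in Table 4.1. The function name `ci_to_ri_ratio` is hypothetical.

```python
def ci_to_ri_ratio(ci_lower, ci_upper, ri_lower, ri_upper):
    """Width of a limit's 90% CI divided by the width of the 95% RI."""
    return (ci_upper - ci_lower) / (ri_upper - ri_lower)


RATIO_LIMIT = 0.2

ratio_ll = ci_to_ri_ratio(10.09, 10.51, 10.3, 29.7)  # lower limit
ratio_ul = ci_to_ri_ratio(29.49, 29.91, 10.3, 29.7)  # upper limit
print(round(ratio_ll, 3), ratio_ll < RATIO_LIMIT)
print(round(ratio_ul, 3), ratio_ul < RATIO_LIMIT)
```

Both ratios come out at about 0.022, well below the 0.2 limit, so the sample size is considered adequate for both limits.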

Copyright (C) [2023] [LabR Gr]

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

6. References


  • AMEIJEIRAS-ALONSO, J.; CRUJEIRAS, R. M.; RODRIGUEZ-CASAL, A. multimode: An R Package for Mode Assessment. Journal of Statistical Software, v. 97, n. 9, 2021.

  • ARSENEAU, E.; BALION, C. M. Statistical methods used in the calculation of geriatric reference intervals: a systematic review. Clinical Chemistry and Laboratory Medicine (CCLM), v. 54, n. 3, p. 377–388, 1 Mar. 2016.

  • BEASLEY, Charles M.; CROWE, Brenda; NILSSON, Mary; et al. Adaptation of the robust method to large distributions of reference values: program modifications and comparison of alternative computational methods. Journal of Biopharmaceutical Statistics, v. 29, n. 3, p. 516-528, 2019.

  • BENAGLIA, T. et al. mixtools: An R Package for Analyzing Mixture Models. Journal of Statistical Software, v. 32, p. 1–29, 2010.

  • BRAGA, F.; PANTEGHINI, M. Generation of data on within-subject biological variation in laboratory medicine: An update. Critical Reviews in Clinical Laboratory Sciences, v. 53, n. 5, p. 313–325, Oct. 2016.

  • BOYD, J. C. Defining laboratory reference values and decision limits: populations, intervals, and interpretations. Asian Journal of Andrology, v. 12, n. 1, p. 83–90, Jan. 2010.

  • BROWNLEE, Jason. Difference Between Algorithm and Model in Machine Learning. Machine Learning Mastery, 28 Apr. 2020.

  • CASTELLONE, D. D. Establishing reference intervals in the coagulation laboratory. International Journal of Laboratory Hematology, v. 39 Suppl 1, p. 121–127, May 2017.

  • CERIOTTI, F.; HENNY, J. “Are my Laboratory Results Normal?” Considerations to be Made Concerning Reference Intervals and Decision Limits. EJIFCC, v. 19, n. 2, p. 106–114, 16 Oct. 2008.

  • CLINICAL AND LABORATORY STANDARDS INSTITUTE. Defining, Establishing, and Verifying Reference Intervals in the Clinical Laboratory; Approved Guideline-Third Edition. CLSI document EP28-A3c. Wayne, PA: Clinical and Laboratory Standards Institute; 2008.

  • COISNON, C. et al. Subjective assessment of frequency distribution histograms and consequences on reference interval accuracy for small sample sizes: A computer-simulated study. Veterinary Clinical Pathology, v. 50, n. 3, p. 427–441, 2021.

  • DALY, C. H. et al. A systematic review of statistical methods used in constructing pediatric reference intervals. Clinical Biochemistry, v. 46, n. 13–14, p. 1220–1227, Sep. 2013.

  • CROSS, M. E.; PLUNKETT, E. V. E. Statistical principles. In: ___ (ed.). Physics, Pharmacology and Physiology for Anaesthetists: Key Concepts for the FRCA. 2nd ed. New York: Cambridge University Press, 2014. p. 349-373.

  • FAVERO, L. P.; BELFIORE, P. Univariate Descriptive Statistics. In: _____ (ed.). Data Science for Business and Decision Making. 1st ed. Pearson, 2019. ch. 3, p. 21-91.

  • FARRELL, C. L.; NGUYEN, L. Indirect Reference Intervals: Harnessing the Power of Stored Laboratory Data. Clinical Biochemist Reviews, v. 40, n. 2, p. 99–111, 2019.

  • FERREIRA, C. E. DOS S.; ANDRIOLO, A. Intervalos de referência no laboratório clínico. Jornal Brasileiro de Patologia e Medicina Laboratorial, v. 44, p. 11–16, Feb. 2008.

  • FINNEGAN, Daniel. referenceIntervals: Reference Intervals. Version 1.2.0, License GPL-3, 2020.

  • FORSMAN, R. W. Why is the laboratory an afterthought for managed care organizations? Clinical Chemistry, v. 42, n. 5, p. 813-816, 1996.

  • FRALEY, C.; RAFTERY, A. E. How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis. The Computer Journal, v. 41, n. 8, p. 578–588, Jan. 1998.

  • FRALEY, Chris; RAFTERY, Adrian E. Model-Based Clustering, Discriminant Analysis, and Density Estimation. Journal of the American Statistical Association, v. 97, n. 458, p. 1012-10151, 2002.

  • FRASER, C. G. General strategies to set quality specifications for reliability performance characteristics. Scandinavian Journal of Clinical and Laboratory Investigation, v. 59, n. 7, p. 487–490, Jan. 1999.

  • FRASER, Callum G; PETERSEN, Per Hyltoft; LIBEER, Jean-Claude; et al. Proposals for Setting Generally Applicable Quality Goals Solely Based on Biology. Annals of Clinical Biochemistry: International Journal of Laboratory Medicine, v. 34, n. 1, p. 8–12, 1997.

  • FORBES, C.; EVANS, M.; HASTINGS, N.; PEACOCK, B. Normal (Gaussian) Distribution. In: _____ (ed.). Statistical Distributions. 4th ed. Wiley, 2011. ch. 33, p. 147-148.

  • GARCIA, Salvador; LUENGO, Julian; HERRERA, Francisco. Data Preprocessing in Data Mining. Springer, 1st edition, 2015.

  • GREEN, P. J. Introduction to Finite Mixtures. In: FRUHWIRTH-SCHNATTER, Sylvia; CELEUX, Gilles; ROBERT, C. P. Handbook of Mixture Analysis. 2nd ed. New York: CRC Press, 2019. p. 3-18.

  • GUTKIN, Stephen. Writing High-Quality Medical Publications: A User’s Manual. 1st edition, CRC Press, 2018.

  • HAWKINS, R. C.; BADRICK, T. Reference interval studies: what is the maximum number of samples recommended? Clinical Chemistry and Laboratory Medicine, v. 51, n. 11, p. 2161–2165, Nov. 2013.

  • HAECKEL, R. et al. Critical comments to a recent EFLM recommendation for the review of reference intervals. Clinical Chemistry and Laboratory Medicine, v. 55, n. 3, p. 341–347, 1 Mar. 2017.

  • HAECKEL, R.; WOSNIOK, W. A new concept to derive permissible limits for analytical imprecision and bias considering diagnostic requirements and technical state-of-the-art. Clinical Chemistry and Laboratory Medicine, v. 49, n. 4, p. 623–635, Apr. 2011.

  • HAECKEL, R.; WOSNIOK, W. Observed, unknown distributions of clinical chemical quantities should be considered to be log-normal: a proposal. Clinical Chemistry and Laboratory Medicine, v. 48, n. 10, 1 Jan. 2010.

  • HAECKEL, Rainer; WOSNIOK, Werner; STREICHERT, Thomas; et al. Review of potentials and limitations of indirect approaches for estimating reference limits/intervals of quantitative procedures in laboratory medicine. Journal of Laboratory Medicine, v. 45, n. 2, p. 35-53, 2021.

  • HARTIGAN, J. A. Testing for Antimodes. In: GAUL, W.; OPITZ, O.; SCHADER, M. (ed.). Data Analysis - Scientific Modeling and Practical Application. Berlin: Springer, 2000. p. 169-181.

  • HENNY, Joseph; VASSAULT, Anne; BOURSIER, Guilaine; et al. Recommendation for the review of biological reference intervals in medical laboratories. Clinical Chemistry and Laboratory Medicine (CCLM), v.54, n.12, p.1893-1900, 2016. Link here.

  • HIGGINS, V.; ASGARI, S.; ADELI, K. Choosing the best statistical method for reference interval estimation. Clinical Biochemistry, v. 71, p. 14–16, 2019. Link here.

  • HOLMES, Daniel T; BUHR, Kevin A. Widespread Incorrect Implementation of the Hoffmann Method, the Correct Approach, and Modern Alternatives. American Journal of Clinical Pathology, v. 151, n. 3, p. 328-336, 2019. Link here.

  • HORN, P. S.; PESCE, A. J. Reference intervals: an update. Clinica Chimica Acta, v. 334, n. 1–2, p. 5–23, ago. 2003. Link here.

  • HORN, P. S.; PESCE, A. J.; COPELAND, B. E. A robust approach to reference interval estimation and evaluation. Clinical Chemistry, v.44, n.3, p.622-631, 1998. Link here.

  • HORN, Paul S; FENG, Lan; LI, Yanmei; et al. Effect of Outliers and Nonhealthy Individuals on Reference Interval Estimation. Clinical Chemistry, v. 47, n. 12, p. 2137-2145, 2001. Link here.

  • HORN, Paul S; PESCE, Amadeo J; COPELAND, Bradley E. Reference Interval Computation Using Robust vs Parametric and Nonparametric Analyses. Clinical Chemistry, v. 45, n. 12, p. 2284-2285, 1999. Link here.

  • HUSSAIN, A. Robust outlier detection techniques for skewed distributions and applications to real data. Thesis (Doctor of Philosophy Degree in Econometrics) – International Islamic University. Islamabad. p. 133. 2011. Link here.

  • ICHIHARA, K. et al. A global multicenter study on reference values: 1. Assessment of methods for derivation and comparison of reference intervals. Clinica Chimica Acta; International Journal of Clinical Chemistry, v. 467, p. 70–82, Apr. 2017. Link here.

  • ICHIHARA, K.; BOYD, J. C. An appraisal of statistical procedures used in derivation of reference intervals. Clinical Chemistry and Laboratory Medicine, v. 48, n. 11, 1 Jan. 2010. Link here.

  • JONES, Graham R. D.; BARKER, Antony. Standardisation of Reference Intervals: An Australasian View. The Clinical Biochemist Reviews, v. 28, n. 4, p. 169-173, 2007. Link here.

  • JONES, Graham R. D.; HAECKEL, Rainer; LOH, Tze Ping; et al. Indirect methods for reference interval determination - review and recommendations. Clinical Chemistry and Laboratory Medicine, v. 57, n. 1, p. 20-29, 2018. Link here.

  • JONES, Graham Ross Dallas. Validating common reference intervals in routine laboratories. Clinica Chimica Acta; International Journal of Clinical Chemistry, v. 432, p. 119-121, 2014. Link here.

  • KATAYEV, A.M.B.; BALCIZA, C.; SECCOMBE, D. W. Establishing Reference Intervals for Clinical Laboratory Test Results: is there a better way? American Journal of Clinical Pathology, v. 133, p. 180-186, 2010. Link here.

  • KLEE, G. G. et al. Reference Intervals: Comparison of Calculation Methods and Evaluation of Procedures for Merging Reference Measurements From Two US Medical Centers. American Journal of Clinical Pathology, v. 150, n. 6, p. 545–554, 24 Oct. 2018. Link here.

  • KRAMER, M. S.; FEINSTEIN, A. R. Clinical biostatistics: LIV. The biostatistics of concordance. Clinical Pharmacology and Therapeutics, v. 29, n. 1, p. 111–123, 1981. Link here.

  • LAHTI, A. et al. Objective Criteria for Partitioning Gaussian-distributed Reference Values into Subgroups. Clinical Chemistry, v. 48, n. 2, p. 338–352, 1 Feb. 2002. Link here.

  • LE BOEDEC, K. Sensitivity and specificity of normality tests and consequences on reference interval accuracy at small sample size: a computer-simulation study. Veterinary Clinical Pathology, v. 45, n. 4, p. 648–656, Dec. 2016. Link here.

  • LYKKEBOE, Simon; NIELSEN, Claus Gyrup; CHRISTENSEN, Peter Astrup. Indirect method for validating transference of reference intervals. Clinical Chemistry and Laboratory Medicine, v. 56, n. 3, p. 463-470, 2018. Link here.

  • MACHIN, David; CAMPBELL, Michael J.; TAN, Say Beng; TAN, Sze Huey. Sample Sizes for Clinical, Laboratory and Epidemiology Studies. 4. ed. Hoboken, NJ: Wiley, 2018. Link here.

  • MARTINEZ-SANCHEZ, L. et al. Big data e intervalos de referencia: motivacion, practicas actuales, prerrequisitos de armonizacion y estandarizacion y futuras perspectivas en el calculo de intervalos de referencia mediante metodos indirectos. Advances in Laboratory Medicine / Avances en Medicina de Laboratorio, v. 2, n. 1, p. 17–25, 1 Mar. 2021. Link here.

  • Mining Your Routine Data for Reference Intervals: Hoffmann, Bhattacharya and Maximum Likelihood. Link here.

  • MIOT, Hélio Amante. Avaliacao da normalidade dos dados em estudos clinicos e experimentais. Jornal Vascular Brasileiro, v. 16, n. 2, p. 88-91, 2017. Link here.

  • Model Based Clustering Essentials. Datanovia, [n.d.]. Link here.

  • OMUSE, G. et al. Determination of reference intervals for common chemistry and immunoassay tests for Kenyan adults based on an internationally harmonized protocol and up-to-date statistical methods. PLOS ONE, v. 15, n. 7, p. e0235234, 9 Jul. 2020. Link here.

  • OMUTO, C.T.; VARGAS, R.R.; EL MOBARAK, A.M.; MOHAMED, N.; VIATKIN, K.; YIGINI, Y. Apendice B: Preguntas frecuentes al implementar R. In:___. (org.). Mapeo de suelos afectados por salinidad - Manual tecnico (Spanish Edition). FAO, Rome, Italy, 2021. p. 101. Link here.

  • OSBORNE, J. W. Extreme and Influential Data Points. In: _____.(org.). Best Practices in Data Cleaning - A Complete Guide to Everything You Need to Do Before and After Collecting Your Data. 1. ed. California: KDPPR, 2012. chap. 7, p. 139-168. Link here.

  • OZARDA, Y. et al. A nationwide multicentre study in Turkey for establishing reference intervals of haematological parameters with novel use of a panel of whole blood. Biochemia Medica, v. 27, n. 2, p. 350–377, 15 Jun. 2017. Link here.

  • OZARDA, Y. et al. Comparison of reference intervals derived by direct and indirect methods based on compatible datasets obtained in Turkey. Clinica Chimica Acta; International Journal of Clinical Chemistry, v. 520, p. 186–195, Sep. 2021. Link here.

  • OZARDA, Y. et al. Distinguishing reference intervals and clinical decision limits - A review by the IFCC Committee on Reference Intervals and Decision Limits. Critical Reviews in Clinical Laboratory Sciences, v. 55, n. 6, p. 420–431, Sep. 2018. Link here.

  • OZARDA, Yesim. Establishing and using reference intervals. Turkish Journal of Biochemistry, v. 45, n. 1, p. 1-10, 2020. Link here.

  • OZARDA, Yesim; HIGGINS, Victoria; ADELI, Khosrow. Verification of reference intervals in routine clinical laboratories: practical challenges and recommendations. Clinical Chemistry and Laboratory Medicine (CCLM), v. 57, n. 1, p. 30-37, 2018. Link here.

  • OZCURUMEZ, M. K. et al. Determination and verification of reference interval limits in clinical chemistry. Recommendations for laboratories on behalf of the Working Group Guide Limits of the DGKL with respect to ISO Standard 15189 and the Guideline of the German Medical Association on Quality Assurance in Medical Laboratory Examinations (Rili-BAEK). Journal of Laboratory Medicine, v. 43, n. 3, p. 127–133, 1 Jun. 2019. Link here.

  • PETERSEN, P. H. et al. Combination of Analytical Quality Specifications Based on Biological Within- and Between-Subject Variation. Annals of Clinical Biochemistry, v. 39, n. 6, p. 543–550, 1 Nov. 2002. Link here.

  • PETERSEN, Per H; LUND, Flemming; FRASER, Callum G; et al. Analytical performance specifications for changes in assay bias for data with logarithmic distributions as assessed by effects on reference change values. Annals of Clinical Biochemistry, v. 53, n. 6, p. 686-691, 2016. Link here.

  • PINHEIRO, J. I. D. et al. Analise exploratoria de dados amostrais. In: _____ Probabilidade e estatistica: Quantificando a incerteza de medicao. 1. ed. Rio de Janeiro: Elsevier, 2012. chap. 7, p. 235-290. Link here.

  • POOLE, S.; SCHROEDER, L. F.; SHAH, N. An unsupervised learning method to identify reference intervals from a clinical database. Journal of Biomedical Informatics, v. 59, p. 276–284, Feb. 2016. Link here.

  • RONDA, F. G. et al. Key questions about the future of laboratory medicine in the next decade of the 21st century: A report from the IFCC-Emerging Technologies Division. Clinica Chimica Acta, v. 495, p. 570-589, 2019. Link here.

  • SCHLATTMANN, Peter. Introduction - Heterogeneity in Medicine In: ___ (ed.). Medical Applications of Finite Mixture Models (Statistics for Biology and Health). Berlin: Springer, 2009. p. 7-28. Link here.

  • SCRUCCA, L. et al. mclust 5: Clustering, Classification and Density Estimation Using Gaussian Finite Mixture Models. The R Journal, v. 8, n. 1, p. 289–317, Aug. 2016. Link here.

  • SERDAR, Ceyhan Ceran; CIHAN, Murat; YUCEL, Dogan; et al. Sample size, power and effect size revisited: simplified and practical approaches in pre-clinical, clinical and laboratory studies. Biochemia Medica, v. 31, n. 1, p. 010502, 2021. Link here.

  • SHINE, B. Use of routine clinical laboratory data to define reference intervals. Annals of Clinical Biochemistry, v. 45, n. Pt 5, p. 467–475, Sep. 2008. Link here.

  • SIKARIS, Kenneth A. Separating disease and health for indirect reference intervals. Journal of Laboratory Medicine, v. 45, n. 2, p. 55–68, 2021. Link here.

  • SMIT, F. C. et al. Establishment of reference intervals of biochemical analytes for South African adults: a study conducted as part of the IFCC global multicentre study on reference values. Journal of Medical Laboratory Science and Technology of South Africa, v. 3, n. 1, p. 8–23, 31 May 2021. Link here.

  • SOLBERG, H. E. Approved recommendation (1987) on the theory of reference values. Part 5. Statistical treatment of collected reference values. Determination of reference limits. Clinica Chimica Acta, v. 170, n. 2, p. S13–S32, 1 Dec. 1987. Link here.

  • SOLBERG, H. E.; LAHTI, A. Detection of Outliers in Reference Distributions: Performance of Horn’s Algorithm. Clinical Chemistry, v. 51, n. 12, p. 2326–2332, 1 Dec. 2005. Link here.

  • TATE, Jillian R.; KOERBIN, Gus; ADELI, Khosrow. Opinion Paper: Deriving Harmonised Reference Intervals - Global Activities. EJIFCC, v. 27, n. 1, p. 48-65, 2016. Link here.

  • TATE, Jillian R; SIKARIS, Ken A; JONES, Graham RD; et al. Harmonising Adult and Paediatric Reference Intervals in Australia and New Zealand: An Evidence-Based Approach for Establishing a First Panel of Chemistry Analytes. The Clinical Biochemist Reviews, v. 35, n. 4, p. 213-235, 2014. Link here.

  • TATE, Jillian R; YEN, Tina; JONES, Graham R D. Transference and Validation of Reference Intervals. Clinical Chemistry, v. 61, n. 8, p. 1012–1015, 2015. Link here.

  • THRUN, Michael C.; GEHLERT, Tino; ULTSCH, Alfred. Analyzing the fine structure of distributions. PLOS ONE, v. 15, n. 10, p. e0238835, 2020. Link here.

  • TUKEY, J. W. SCHEMATIC SUMMARIES (pictures and numbers). In: _____.(org.). Exploratory Data Analysis. 1. ed. Rio de Janeiro: Pearson, 1977. chap. 2, p. 27-56. Link here.

  • WELLEK, S. et al. Determination of reference limits: statistical concepts and tools for sample size calculation. Clinical Chemistry and Laboratory Medicine, v. 52, n. 12, p. 1685–1694, Dec. 2014. Link here.

  • XIE, Yihui; DERVIEUX, Christophe; RIEDERER, Emily. Master Machine Learning Algorithms: Discover How They Work and Implement Them From Scratch. Machine Learning Mastery, 1st edition, 2017. Link here.

  • YU, Youjiao. An introduction to mixR. 2021. Link here.

  • ZIERK, Jakob; ARZIDEH, Farhad; KAPSNER, Lorenz A.; et al. Reference Interval Estimation from Mixed Distributions using Truncation Points and the Kolmogorov-Smirnov Distance (kosmic). Scientific Reports, v. 10, n. 1, p. 1704, 2020. Link here.





  • Information about the processing and analysis time of the data by the Reference Interval tool:

## [1] "Starting date and time:"
## [1] "Saturday, 20 May 2023, 11:13:20"
## [1] "End date and time:"
## [1] "Saturday, 20 May 2023, 11:14:23"
## [1] "'End date and time' - 'Starting date and time'="
## Time difference of 1.05 mins
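
The reported time difference can be checked by hand from the two timestamps above. The following is a minimal sketch in Python, used here only for illustration and not taken from the tool itself; the variable names are hypothetical.

```python
from datetime import datetime

# Timestamps copied from the report output above (20 May 2023).
start = datetime(2023, 5, 20, 11, 13, 20)
end = datetime(2023, 5, 20, 11, 14, 23)

# Elapsed time in seconds, then converted to minutes.
elapsed_seconds = (end - start).total_seconds()
elapsed_minutes = elapsed_seconds / 60

print(f"{elapsed_seconds:.0f} s = {elapsed_minutes:.2f} min")
```

The 63 seconds between start and end correspond to the 1.05 minutes shown in the output.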

B. Summary

\[\huge\textbf{Information about the study}\]

  • Responsible person: Alan C Dias;

  • Measurement procedure: Information about the measurement procedure;

  • Name of the measurand: testcase1;

  • Unit of measurement: n.a.;

  • Type of blood specimen: Inform the type of blood specimen;

  • Exclusion criteria: Inform the exclusion criteria;

  • Data source: Inform about data source;

  • Age range: Inform the age range;

  • Sex: Inform the sex;

  • Settings:

  • Manual selection of the number of clusters: The number of clusters was automatically chosen by the algorithm.

\[\huge\textbf{Estimated Reference Interval}\]

Table 4.1. Reference Interval estimated by the LabRI method. testcase1 (n.a.) - Age: Inform the age range - Sex: Inform the sex. n = 9419.

| LabRI method | 95% Reference Interval - Bilateral | 90% Confidence Interval (CI) of the Lower Limit | 90% Confidence Interval (CI) of the Upper Limit |
|---|---|---|---|
| Results | 10.3 to 29.7 | 10.09 to 10.51 | 29.49 to 29.91 |

Footnote:
Comparative Reference Interval: 10.2 to 29.8
Source of Comparative Reference Interval (RI):
* dataset - Package ‘refineR’

\[\huge\textbf{Indirect verification of the Reference Interval}\]


Table 5.6. Evaluation of the ‘Flagging Rates’ of Data from the ‘New Truncated Global Distribution’ (n = 9419).

| Reference Interval (RI) | Sample size | %TORR | %RWRI | %BLL | %AUL | %D-BLL-AUL | %S-BLL-AUL |
|---|---|---|---|---|---|---|---|
| RI of Comparative Reference | 9419 | 5.81 | 80.77 | 9.63 | 9.60 | 0.03 | 19.23 |
| RI of LabRI method | 9419 | 5.81 | 80.50 | 9.77 | 9.74 | 0.03 | 19.51 |

%TORR: % Truncated Out of Range Results; %RWRI: % Results Within the Reference Interval; %BLL: % Below the Lower Limit; %AUL: % Above the Upper Limit; %D-BLL-AUL: Difference between the %BLL and the %AUL; %S-BLL-AUL: Sum of the %BLL and the %AUL.
Footnote:
When the total percentage of values outside the truncated range (%TORR), %BLL, %AUL, %D-BLL-AUL, or %S-BLL-AUL exceeds 10%, 5.7%, 5.7%, 4.7%, or 6.7%, respectively, it is recommended to review the exclusion criteria applied to the original database or the partitioning by sex or age group. In these cases, the corresponding table cell has a red background.
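
The footnote rule can be sketched as a simple threshold check. The Python sketch below is illustrative only and is not the tool's implementation. It assumes that the five thresholds pair with the five rates in the order listed, that the difference is taken as an absolute value, and it uses the column names from the table legend; the function names are hypothetical.

```python
# Thresholds (in %) from the Table 5.6 footnote. Pairing each threshold with a
# rate by order of appearance is an assumption of this sketch.
THRESHOLDS = {
    "TORR": 10.0,
    "BLL": 5.7,
    "AUL": 5.7,
    "D-BLL-AUL": 4.7,
    "S-BLL-AUL": 6.7,
}


def flagging_rates(torr, bll, aul):
    """Collect the five rates, deriving the difference and sum from %BLL and %AUL."""
    return {
        "TORR": torr,
        "BLL": bll,
        "AUL": aul,
        # Absolute difference is assumed; the report does not state the sign convention.
        "D-BLL-AUL": round(abs(bll - aul), 2),
        "S-BLL-AUL": round(bll + aul, 2),
    }


def flagged(rates):
    """Return the names of the rates that exceed their threshold."""
    return [name for name, value in rates.items() if value > THRESHOLDS[name]]


# Values taken from Table 5.6.
comparative = flagging_rates(torr=5.81, bll=9.63, aul=9.60)
labri = flagging_rates(torr=5.81, bll=9.77, aul=9.74)

print("Comparative RI:", flagged(comparative))
print("LabRI RI:", flagged(labri))
```

Under this assumed pairing, the derived difference and sum reproduce the tabulated 0.03 and 19.23 (19.51 for the LabRI method), and %BLL, %AUL and %S-BLL-AUL exceed their thresholds for both intervals.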
Table 5.7. Validating the sample size.

| Statistical Parameters | Lower Limit of RI | Upper Limit of RI |
|---|---|---|
| Width of the 90% Confidence Interval (CI) | 10.51 - 10.09 = 0.42 | 29.91 - 29.49 = 0.42 |
| Width of the 95% Reference Interval (RI) | 29.7 - 10.3 = 19.4 | 29.7 - 10.3 = 19.4 |
| ‘CI width’ / ‘RI width’ ratio | 0.42 / 19.4 = 0.022 | 0.42 / 19.4 = 0.022 |
| ‘CI width’ / ‘RI width’ ratio limit | 0.2 | 0.2 |

Footnote:
CI: Confidence Interval. RI: Reference Interval. Columns whose ‘CI width’ / ‘RI width’ ratio exceeds 0.2 are shown with a yellow background. In that case, the recommendation is to visually assess the distribution of the results and check whether the estimated Reference Limit yields a percentage of abnormal results that differs significantly from the expected one.
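
The ratio check in Table 5.7 is plain arithmetic and can be reproduced as follows. This is a minimal Python sketch for illustration, not the tool's own code; the function name and the `RATIO_LIMIT` constant are hypothetical, and the numbers come from Tables 4.1 and 5.7.

```python
RATIO_LIMIT = 0.2  # 'CI width' / 'RI width' ratio limit from Table 5.7


def ci_to_ri_ratio(ci_low, ci_high, ri_low, ri_high):
    """Width of a limit's confidence interval divided by the reference interval width."""
    return (ci_high - ci_low) / (ri_high - ri_low)


# 95% RI: 10.3 to 29.7; 90% CIs of the lower and upper limits from Table 4.1.
lower = ci_to_ri_ratio(10.09, 10.51, 10.3, 29.7)
upper = ci_to_ri_ratio(29.49, 29.91, 10.3, 29.7)

for name, ratio in (("lower", lower), ("upper", upper)):
    status = "review" if ratio > RATIO_LIMIT else "ok"
    print(f"{name} limit: ratio = {ratio:.3f} ({status})")
```

Both ratios round to 0.022, well below the 0.2 limit, so neither column would be highlighted.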